Chapter 7 Feature Description: CHAPTER7.MCD

We'll need to check points are within an image

We'll use some operators from Chapter 3. If you want to write images out or to see them as pictures, you'll need to normalise them:

and we'll often need to clear them

From Chapter 4, we'll start using edge magnitude now, so we need the Sobel operator, and edge magnitude and direction

And here's some of the operators from Chapter 5:

Feature Extraction & Image Processing,

M. S. Nixon and A. S. Aguado

Elsevier/ Butterworth Heinmann, 2nd Edition 2007

We'll need some of the (Fourier transform) operators from Chapter 2:

This worksheet is the companion to Chapter 7 and implements the basic feature description techniques described therein. The worksheet follows the relevant sections of the text directly.

This chapter concerns shape description which is the process by which we generate a set of numbers which describes a shape. We could use the set of points in a shape's border, but this set is usually too large. Also, it will change with position (translation), rotation and scale. We want our description to be invariant to these factors, so that we describe the shape itself.


We'll start with regions and borders, like the text. We'll first need a shape to analyse, let's use the one like

We'll first need routines to check whether a point is connected to any of it's 4 neighbours. We shall here assume that shape points are white (255) and the background is black (0). So for 4-way connectivity, we simply need to check if any of a (white) point's 4 immediate neighbours are also white.

Let's check this for the point at the top left corner of the shape, the point with co-ordinates 1,1

Let's check a point inside the shape

So the function returns 1 when there is no white neighbour, and 0 when there is (this makes the software easier later).

We'll also need to check 8-way connectivity, to assess whether any of a (white) point's immediate neighbours are also white

So let's check again

Note that the point with co-ordinates x=2, y=2 has one non-white neighbour whereas the point with co-ordinates x=2, y=3 is the only white point surrounded entirely by white points.

In order to start the process of finding a shape's boundary, we need to start somewhere. We'll use a routine which scans the image to find the first white point. It actually scans the whole image, keeping the co-ordinates of the first point it found. Clearly, this can be speeded up by breaking the loop(s) when a point is found.

initialise co-ordinates of first point

loop through image

check to see if it's the first white point

if it is, store it

return co-ordinates

Let's check it out, by applying it to the shape just defined:

Let's superimpose it:

That's just right. Let's reset it before we forget:

We can now assess connectivity and find the first point. Having found a point, we need to find if there's a connected point in any of the allowed directions. We will combine 8-way connectivity with 4-way neighbourhood analysis, and 4-way boundary connectivity with 8-way region analysis. We'll do the latter of these first. Essentially, at each point, we look for a new point along the four compass directions. If no point is found then the routine returns co-ordinates for a new point which are the same as originally supplied, allowing the calling software to realise that no new point was found. The order of the search is so as to enforce clockwise analysis (by searching the anticlockwise points first, giving precedence to the clockwise points); the last point searched is towards East. The point at the original co-ordinates is set to a grey level in order to mark the completed boundary.

Let's check this out, starting with the first point:

We can see that the grey point has been marked as the first

point and its connecting neighbour is to its right

So let's do it again:

and let's see how we move on:

So we now have the components necessary to find the points in a boundary. We find a first point, and then keep finding points until we can find no more:

So let's find the border of our shape as

Giving a border

We can also do this by combining 8-way connectivity with 4-way neighbourhood analysis. We have to redefine the earlier four way connectivity to check eight neighbours

we then get a boundary with diagonal lines as

So we can now find the border two different ways: 4-way connectivity gives a blocky effect whereas 8-way connectivity makes the shape to appear more discontinuous. Note that these are just the border which are points on the boundary. We'll now consider boundary descriptions, ways of describing the boundary we have just found. Let's generate a chain code by 4-way connectivity. Each bit of the code is assigned according to the connection between successive points in shape's boundary. In 4-way connectivity, the codes for particular directions are: North=0, South=1, East=2, and West 3. The chain is simply the series of successive code allocations. We'll reuse the earlier code and add in the chain code bits

And we redefine the 4 way analysis to generate the code as

Let's check it out on a new shape,

It is naturally invariant to translation, so we obtain the same code wherever the shape is placed with an image.

So let's try it on a real image:

and then let's show its border

and find its code:

We won't try start point invariance here, as that requires a shift operation (which must be fun in Mathcad!). We won't do rotation or scale invariance either, as that is beyond the scope of this software.

Now we'll look at Fourier descriptors which give a complete description of an object's boundary with the inherent advantages of a Fourier description (access to frequency components etc.).


First, we'll need an initial contour. Again We'll use the function ell_points to generate a set of no points

arranged in an ellipse, centre xc,yc with major and minor axes a and b. We'll use the snake format, including elasticity, curvature and edge-weighting, which we won't use for now, but we'll use it later.

and sometimes we'll want to threshold them

So let's generate an ellipse:

and draw them on an empty image

Now we need to extract a set of complex numbers where the x and y co-ordinates are stored as x+jjy. To give the set, we'll use the function extract

Let's see the set:

So the first point is along the x axis, to the right of the centre, the points continue clockwise, so the second point is just beneath the first.

To generate the Fourier components, we'll need a pointer to all components:

And by using the DFT we get the components as:

So let's have a peep:

Here, the first component is the centre of the ellipse, expressed in complex form. The second component is the radius of a circle which best fits the data. The third component is zero, but remaining ones are not. In graphical form, this is

we now need a point index to the original data

To reconstruct the shape, we can apply the inverse DFT as:

This gives a set of points in complex format. So we'll need to extract them as a contour so that we can use the same drawing algorithm, using the function ext_points

So let's use it

and check we've got back to where we started from:

original image

reconstruction

difference

Yup, it works!

Let's have a quick look at some properties: First, the Fourier descriptors of a line are real: why?

Let's generate a line:

and draw them on the empty image

Let's get the points via the function extract

So our descriptors are:

Now for a circle, the first descriptors are again the centre co-ordinates, the second is the radius (for the ellipse, the second co-ordinate is the average radius).

So let's generate a circle:

and draw them on an empty image

Using the function extract

So our descriptors are:

Yup, right again. Now over to you, ..............why?

Now these descriptors have no invariance properties: if the shape is larger, or is rotated, then the description changes. Let's define a function which gives the descriptors as

And its angular description is defined as

This is actually three descriptions, the angular change, the cumulative and gular change, and the normalised angular function. Only the last of these is useable in shape description. Let's see how they go for a circle:

We'll need a pointer so that we can plot the functions:

Our angular functions are then

This changes as we go round the circle, modulo 0,2p

This changes as we go round the circle from 0 to -2p

This is better, it's normalised around the circle's perimeter.

We can now use the normalised function to give some angular Fourier descriptors. The code is

So for our circle, we get

Note how the noise affects the higher order descriptors

Let's see the difference between these descriptors and those of a real shape. We'll use the Greedy algorithm to find a shape since it can provide a nice set of points describing a border. We'll then use Fourier descriptors to describe the shape, and compare it with those of another. Let's read in the image of the club suit in playing cards

find the edges

normalise them

Now let's define an initial contour

So our initial contour is a circle surrounding the shape. The contour aims to shrink to find it

Let's try one iteration of our Greedy algorithm

The left hand side has moved towards the club; the top has also moved in, but not by as much.

Let's try another iteration

The bottom has nearly reached the club, the left hand and right hand top sections have a bit to go.

This is a fine description indeed

So, let's look at the Fourier descriptors.

Hmmm, these look pretty similar to those of a circle, we'd better see if there is actually any difference.

There certainly is!

Let's compute some elliptic Fourier descriptors. For a contour con, these are defined as

Let's see how they go

The zero order moments again define the ellipse's centre co-ordinates

So the centre information is in ax and ay, and is twice the size (due to the nature of the equations)

Let's have a look at the first order ones::

So ax(1) and by(1) describe the major and minor axes, respectively.

Let's have a look at a large set:

So ax and by only describe the ellipse. You can see the influence of noise in the higher order descriptors bx and by.

So let's do it for the image of a club:

Let's have a look at the descriptors, they are clearly different from those of the ellipse! The descriptors bx certainly have value now.

So let's reconstruct the club's shape:

and the y co-ordinates are

and make the contour as

And look at the difference between the points:

That's close! The largest difference is

That's tiny!

That completes our study of perimeter descriptions. We'll now move to area based descriptions, those which describe regions.

So far, we have looked at methods for perimeter description. There are region descriptors which describe properties of area. We'll start by looking again at an ellipse, so we need a routine which gives an ellipsoidal shaped area. The function draw_ellip does just this, setting to white (255) any point inside an ellipse's boundary.

So let's see it, by drawing it on a black background:

Here it is:

We need a function to turn the points into a vector of point co-ordinates. Extract_shape does just that.

So let's extract it!

The basic area descriptors are common geometrical properties, like area and length of perimeter. The area is simply a count of the points inside a shape and the perimeter length is the number of pixels in it.

So the number of points in the ellipse is

The number of points in the perimeter depends on how the perimeter is extracted. If we extract it by four way analysis, the length will be larger than that for eight way analysis.

That's right!

The compactness of a shape expresses the ratio of perimeter to area as

Naturally, a circle has the most compact shape whereas a line is the least compact. Let's check this

In fact, by theory, the compactness of a circle should be unity, but we are dealing with discretised shapes here. As such, the length of a line is measured as a count of the number of points rather than a pure measure of distance. In fact, that's the main problem as, in discrete images, the length of a line will change with rotation

The irregularity of a shape is defined using the maximum chord length which is the largest distance from the centre to a point on the boundary calculated as

The irregularity is then

So we have

Another measure of irregularity uses the minimum chord length

We can then define the irregularity as

And we get

but these will change with rotation. To avoid that, we could take the ratio of the major chord length to one taken perpendicular to it.

Now, moments are like Fourier descriptors and exist at different scales. So we'll define a set of pointers starting at zero and rising to order 20. We do this for the x axis (p) and for the y axis (q). The moments are defined as

So the moments of our ellipse are

The zero order moment gives the number of points in the shape

(we need to divide it by 255 as that is the brightness of each point in the shape).

When divided by the number of points in the shape and the brightness (i.e. the zero order moment), the first order moments give the x and the y co-ordinates of the centre of the shape

The second order moments are like the moment of inertia in mechanics and they describe a shape's distribution

So our ellipse appears to be distributed more about the y axis (the vertical axis) than it is around the x axis (the horizontal axis). Indeed it is. Let's look at the whole set for the x axis (for which y=0); as they grow in an exponential fashion, we'll plot the log of their value.

So we have something similar to Fourier descriptors: access to a description at differing scales. Unfortunately they will change as soon as we do anything to the shape. It's clear that only the zero order moment is invariant as the first order moments will change even with shift (translation). For this reason, we shall consider centralised moments, which are invariant to translation. We can define these centralised moments m as

The zero order moment is simply reflects the number of points in the area

The first order moments evaluate to zero

The second order moments show distribution

and reflect that the ellipse is distributed more around the y axis, than the x axis.

In order to assess invariance to rotation and translation, let's define rotated and translated versions of the original ellipse

translation

rotation

scaling

original

translated

rotated

scaled

It's no use looking at the first order moments since they equate to zero

So the second order moments are

and

So they don't change with shift, but they rotate with the shape and neither are invariant to scale. That's not too surprising really, since their definition is only phrased in terms of invariance to change in position and that is what we see.

To get better invariance properties, we need normalised central moments h which for p+q>2 are calculated from the centralised moments as

We can then define seven invariant moments from these as

So let's see how they go:

The first order moment is like the average distribution about the two axes

Clearly, this is invariant with rotation, and with scale (the small difference is doubtless due to discretisation effects and is negligible). The second invariant moment is

Again, for an ellipse, this does not vary with rotation, translation or scale, but varies slightly with scale. (Again this is most likely due to discretisation effects. These could be ameliorated by processing shapes of much larger size but that is impractical here.)

For an ellipse, the remaining moments equate to zero.

Let's do it for a real shape, that of a plane. We'll use three silhouettes: one normal, one rotated and the other different

Ah ha, these are black shapes on a white background. Let's redefine our shape extraction operator

and extract the shapes

and let's describe them

Clearly, the set of moments for the B1 bomber is different from the F14 fighter. Also, the rotated and scaled version of the F14 fighter gives very similar values, with small fluctuation due to numerical imprecision. Naturally, these numbers can be used to recognise the shapes. To do that, we need a classifier. That's Chapter 8: enjoy!